fix(KDP): Preserve original dtype for PASSTHROUGH features #30
Merged
piotrlaczkowski merged 6 commits into main on Jul 30, 2025
Co-authored-by: piotr.laczkowski <piotr.laczkowski@gmail.com>
github-actions bot pushed a commit that referenced this pull request on Jul 30, 2025
## <small>1.11.1 (2025-07-30)</small>

* fix(KDP): fixing tests ([6326dbf](6326dbf))
* fix(KDP): formatting issues fixes ([6c60aed](6c60aed))
* fix(KDP): increasing package version ([ce0dbf3](ce0dbf3))
* fix(KDP): Preserve original dtype for PASSTHROUGH features (#30) ([82b6d7e](82b6d7e)), closes [#30](#30)
* fix(KDP): update upload-artifact action to v4 in GitHub workflow ([68ee7c5](68ee7c5))
* Add preserve dtype layer and update passthrough feature handling ([a700bad](a700bad))
* Add pytest markers and improve test categorization for GitHub workflow ([06b5112](06b5112))
* Checkpoint before follow-up message ([47ec0ef](47ec0ef))
* chore: save last release version for recovery [skip ci] ([84e0b1f](84e0b1f))
* refactor(KDP): improving tests execution ([b9d237e](b9d237e))
This pull request introduces a new `PreserveDtypeLayer` to handle passthrough features while preserving their original data types or casting them to a specified type. It also updates the preprocessing pipeline to use this layer and adds tests covering its behavior. Below is a summary of the most important changes:

### Feature Addition and Integration:

* Added `PreserveDtypeLayer` in `kdp/layers/preserve_dtype.py` for preserving or casting input tensor data types. The layer supports serialization and deserialization for integration into Keras models.
* Integrated `PreserveDtypeLayer` into the layer factory by adding a `preserve_dtype_layer` method in `kdp/layers_factory.py`. This allows dynamic creation of the layer.
* Updated `kdp/processor.py` to use `PreserveDtypeLayer` for passthrough features, replacing the previous approach that cast all features to `float32`.

### Testing Enhancements:

* Added unit tests for `PreserveDtypeLayer` in a new file, `test/layers/test_preserve_dtype_layer.py`, covering scenarios like preserving original data types, casting to target data types, batch processing, serialization, and integration into Keras models.
* Updated `test/layers/test_layer_factory.py` to include cases for `PreserveDtypeLayer`, ensuring its compatibility with the layer factory.
* Extended `test/test_processor.py` to validate the behavior of passthrough features with various data types (e.g., string, integer, float) and their preservation in the preprocessing pipeline.

### Configuration and Test Suite Updates:

* Added a `micro` marker in `pytest.ini` for categorizing the fastest tests, including those for `PreserveDtypeLayer`.
* Updated test files (`test/layers/test_layer_factory.py` and `test/test_processor.py`) to include the `micro` marker and additional pytest markers for better test categorization.

These changes let the preprocessing pipeline pass features through with their original dtype, or cast them to a requested one, instead of forcing every passthrough feature to `float32`.
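For readers who want a feel for the idea, here is a minimal sketch of a dtype-preserving Keras layer. It is not the code from `kdp/layers/preserve_dtype.py`: the class name `PreserveDtypeSketch`, the `target_dtype` argument, and the choice to disable Keras input autocasting are all assumptions made for illustration. The sketch only covers numeric casts, since `tf.cast` does not convert to or from strings.

```python
import tensorflow as tf


class PreserveDtypeSketch(tf.keras.layers.Layer):
    """Hypothetical stand-in for PreserveDtypeLayer (not the KDP source)."""

    def __init__(self, target_dtype=None, **kwargs):
        # Assumption: turn off Keras autocasting so that float inputs
        # (e.g. float64) are not silently converted to the compute dtype.
        kwargs["autocast"] = False
        super().__init__(**kwargs)
        # Assumption: None means "leave the input dtype untouched";
        # otherwise a dtype name such as "float32" is expected.
        self.target_dtype = target_dtype

    def call(self, inputs):
        if self.target_dtype is None:
            return inputs  # passthrough: dtype is preserved as-is
        # Numeric cast only; string conversion would need tf.strings ops.
        return tf.cast(inputs, self.target_dtype)

    def get_config(self):
        # Include target_dtype so the layer can be rebuilt via from_config.
        config = super().get_config()
        config.update({"target_dtype": self.target_dtype})
        return config


# Usage: integers stay integers unless a target dtype is requested.
ids = tf.constant([[1], [2], [3]], dtype=tf.int32)
preserved = PreserveDtypeSketch()(ids)
as_float = PreserveDtypeSketch(target_dtype="float32")(ids)
```

Storing `target_dtype` as a plain string keeps `get_config()` JSON-friendly, which is one simple way to get the serialization round trip the description mentions.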